Abstract. We develop a phase space distribution function approach to redshift space distortions
(RSD), in which the redshift space density can be written as a sum over velocity moments of the
distribution function. These moments are density weighted and have a well defined physical interpretation:
their lowest orders are density, momentum density, and stress energy density. The series expansion is
convergent if $k\mu u/aH < 1$, where $k$ is the wavevector, $H$ the Hubble parameter, $u$ the typical
gravitational velocity and $\mu = \cos\theta$, with $\theta$ being the angle between the Fourier mode and the line of sight.
We perform an expansion of these velocity moments into helicity modes, which are eigenmodes under
rotation around the axis of Fourier mode direction, generalizing the scalar, vector, tensor
decomposition of perturbations to an arbitrary order. We show that only equal helicity moments correlate and
derive the angular dependence of the individual contributions to the redshift space power spectrum.
We show that the dominant term of $\mu^2$ dependence on large scales is the cross-correlation between
the density and scalar part of momentum density, which can be related to the time derivative of the
matter power spectrum. Additional terms contributing to $\mu^2$ and dominating on small scales are the
vector part of momentum density-momentum density correlations, the energy density-density
correlations, and the scalar part of anisotropic stress density-density correlations. The second term is what
is usually associated with the small scale Fingers-of-God damping and always suppresses power, but
the first term comes with the opposite sign and always adds power. Similarly, we identify 7 terms
contributing to $\mu^4$ dependence. Some of the advantages of the distribution function approach are that
the series expansion converges on large scales and remains valid in multi-stream situations. We finish
with a brief discussion of implications for RSD in galaxies relative to dark matter, highlighting the
issue of scale dependent bias of velocity moments correlators.

1 Introduction

Galaxy clustering has traditionally been one of the most important ways to extract cosmological 
information. Galaxies are not a faithful tracer of dark matter, as their clustering strength is biased 
relative to the dark matter. However, they are expected to follow the same gravitational potential as 
the dark matter and hence have the same velocities. This is not observable through angular clustering, 
which is only sensitive to correlations transverse to the line of sight. It is however detectable in redshift 
surveys, because the redshift of the galaxy does not provide information only on the radial distance, 
but also on the radial velocity through the Doppler shift. This induces anisotropies in the clustering,
which are generically called redshift space distortions (RSD) [1]. They provide an opportunity to 
extract information on the dark matter clustering directly. On large scales clustering of galaxies 
along the line of sight is enhanced relative to the transverse direction due to peculiar motions and 
this allows one to determine the ratio of logarithmic rate of growth $f$ to bias $b$ [2]. Combining the
statistics from different lines of sight one can eliminate the unknown bias and measure directly the 
logarithmic rate of growth times the amplitude. 

It has been argued that using RSD information could greatly increase our knowledge of cosmo- 
logical models, including tests of dark energy and general relativity [3-5]. Galaxy clustering has clear 
advantages over the alternatives such as weak lensing: it is intrinsically 3-dimensional, thus providing 
better statistics, and it has high signal to noise. While most of the predictions in the literature are 
model dependent, a generic statement can be made that if systematic effects were perfectly under- 
stood RSD would be one of the most powerful techniques for such studies. The main problem with 
RSD is that nonlinear velocity effects extend to rather large scales and give rise to a scale dependent 
and angular dependent clustering signal. It is easy to see these effects in any real redshift survey: one 
sees elongated features along the line of sight, called the fingers-of-god (FoG) effect, which are caused 
by random velocities inside virialized objects such as clusters, which scatter galaxies along the radial 
direction in redshift space, even if they have a localized spatial position in real space. This is just 
an extreme example and other related effects, such as nonlinear infall streaming motions, also cause 
nonlinear corrections. This means that one needs to understand these and separate them from the 
nonlinear evolution of the dark matter and from the nonlinear relation between the galaxies and the 
dark matter, both of which also give rise to a scale dependent bias [6]. 

Several recent studies have investigated these nonlinear effects [7-13], some limiting the analysis 
to dark matter only and some also including galaxies or halos. The common denominator of these 
studies is that they are based on various ansatzes for the scale and angular dependence of RSD, 
typically combined with some perturbation theory analysis. This has the advantage of having just
a few free parameters, so that if the ansatz is accurate one can model the effects accurately. The 
reverse is also true and the problem is that it is difficult to make general statements regarding the 
range of validity for any given model. There is another problem connected to perturbation theory, 
in that the usual perturbation theory makes a single stream approximation, which we know breaks 
down on small scales inside the virialized halos (indeed, FoG are a manifestation of multi-streaming 
on small scales). 

In this paper we present a different approach to RSD: we use a distribution function approach 
to show that one can make a series expansion of RSD, which is convergent on sufficiently large scales 
and we derive the most general form of RSD correlator allowed by the symmetries. In this paper we 
present the formal derivations and conceptual implications, reserving all the applications to future 
work. The structure of this paper is as follows: in section 2 we develop the distribution function 
approach to RSD and derive the helicity decomposition. In section 3 we discuss the power spectra, 
and use rotation symmetries to derive the most general form of the RSD correlator. We also discuss 
the lowest order contributions and connect them to physical quantities such as density, momentum, 
stress energy tensor etc. This is followed by a discussion in section 4. 



2 Redshift-space distortions from the distribution function 

The exact evolution of collisionless particles is described by the Vlasov equation [14]. Following the 
discussion by [15], we start from the distribution function of particles $f(\mathbf{x},\mathbf{q},\tau)$ at a phase-space
position $(\mathbf{x},\mathbf{q})$ and at conformal time $\tau$ in order to derive the perturbative redshift-space distortions.
Here $\mathbf{x}$ is the comoving position and $\mathbf{q} = \mathbf{p}/a = m\mathbf{u}$ is the comoving momentum, where $\mathbf{u} = d\mathbf{x}/d\tau$.
In the following we will omit the time dependence, i.e. we will write $f(\mathbf{x},\mathbf{q})$. The density field in real
space is obtained by averaging the distribution function over momentum: 

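$$\rho(\mathbf{x}) = m_p a^{-3}\int d^3q\, f(\mathbf{x},\mathbf{q}), \tag{2.1}$$

where $m_p$ is the particle mass and $a = 1/(1+z)$ is the scale factor ($z$ is the redshift). In redshift
space the position is distorted by peculiar velocities, thus the comoving redshift-space coordinate for
a particle is given by $\mathbf{s} = \mathbf{x} + \hat{r}\,u_\parallel/\mathcal{H}$, where $\hat{r}$ is the unit vector pointing along the observer's line of
sight, $u_\parallel$ is the radial velocity, $m_p u_\parallel = q_\parallel = \mathbf{q}\cdot\hat{r}$, and $\mathcal{H} = aH$, where $H$ is the Hubble parameter.
Then the mass density in redshift space is given by

$$\rho_s(\mathbf{s}) = m_p a^{-3}\int d^3x\, d^3q\, f(\mathbf{x},\mathbf{q})\,\delta^D\!\left(\mathbf{s}-\mathbf{x}-\hat{r}\frac{u_\parallel}{\mathcal{H}}\right) = m_p a^{-3}\int d^3q\, f\!\left(\mathbf{s}-\hat{r}\frac{u_\parallel}{\mathcal{H}},\mathbf{q}\right). \tag{2.2}$$

By Fourier transforming equation 2.2, we find

$$\rho_s(\mathbf{k}) = m_p a^{-3}\int d^3x\, d^3q\, f(\mathbf{x},\mathbf{q})\, e^{i\mathbf{k}\cdot\mathbf{x} + ik_\parallel u_\parallel/\mathcal{H}} = m_p a^{-3}\int d^3x\, e^{i\mathbf{k}\cdot\mathbf{x}}\int d^3q\, f(\mathbf{x},\mathbf{q})\, e^{ik_\parallel u_\parallel/\mathcal{H}}, \tag{2.3}$$

where $\mathbf{k}$ is the wavevector in redshift space, corresponding to the redshift-space coordinate $\mathbf{s}$.
Now we expand the second integral in equation 2.3 as a Taylor series in $k_\parallel u_\parallel/\mathcal{H}$,

$$m_p a^{-3}\int d^3q\, f(\mathbf{x},\mathbf{q})\, e^{ik_\parallel u_\parallel/\mathcal{H}} = m_p a^{-3}\int d^3q\, f(\mathbf{x},\mathbf{q})\sum_{L=0}^{\infty}\frac{1}{L!}\left(\frac{ik_\parallel u_\parallel}{\mathcal{H}}\right)^L = \bar\rho\sum_{L=0}^{\infty}\frac{1}{L!}\left(\frac{ik_\parallel}{\mathcal{H}}\right)^L T_\parallel^L(\mathbf{x}), \tag{2.4}$$

where $\bar\rho$ is the mean mass density and

$$T_\parallel^L(\mathbf{x}) = \frac{m_p a^{-3}}{\bar\rho}\int d^3q\, f(\mathbf{x},\mathbf{q})\, u_\parallel^L, \tag{2.5}$$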
where in the last expression the integral over phase space assures that the quantity is defined in terms
of a sum over all particles at position $\mathbf{x}$. For a single stream at $\mathbf{x}$ this is just $(1+\delta(\mathbf{x}))\,u_\parallel^L(\mathbf{x})$. These
are thus radial components of the moments of the distribution function and the distribution function
description allows for inclusion of both bulk velocities and multi-streamed velocities. Note that these
quantities are mass-weighted, and so well-defined for any system: one just sums over all the particles
in the system, weighting each one by the appropriate power of its radial velocity. If the field needs
to be defined on a grid a simple assignment scheme of particles to the grid suffices, and empty grid
cells are assigned a value of 0. We note this to contrast it with volume weighted quantities, which need
to be defined even if there are no particles assigned to a given grid cell, which is often impossible for
sparse biased tracers, especially in underdense regions. We return to this issue later.
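
As a minimal sketch of the mass-weighted assignment described above, the following Python function bins particles onto a one-dimensional periodic grid with nearest-grid-point assignment and returns an estimate of $(1+\delta)u_\parallel^L$ in each cell. The function name, the one-dimensional setup and the sample particle values are illustrative assumptions, not part of the original analysis.

```python
def moment_on_grid(positions, velocities, box_size, n_cells, order):
    """Nearest-grid-point estimate of the mass-weighted moment (1 + delta) u^order in 1D."""
    sums = [0.0] * n_cells
    for x, u in zip(positions, velocities):
        # Periodic box: wrap the position and find the cell that contains it.
        cell = int((x % box_size) / box_size * n_cells) % n_cells
        sums[cell] += u ** order
    mean_count = len(positions) / n_cells
    # Empty cells simply keep the value 0; no division by the local density is needed.
    return [s / mean_count for s in sums]


# Hypothetical particle data (equal masses).
positions = [0.1, 0.15, 2.6, 3.9]
velocities = [1.0, 3.0, -2.0, 0.5]
density = moment_on_grid(positions, velocities, box_size=4.0, n_cells=4, order=0)
momentum = moment_on_grid(positions, velocities, box_size=4.0, n_cells=4, order=1)
```

Because each cell only accumulates sums over the particles it contains, the empty cell needs no special treatment, in contrast to a volume-weighted velocity field.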
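The Fourier component of the density fluctuation in redshift space is

$$\delta_s(\mathbf{k}) = \sum_{L=0}^{\infty}\frac{1}{L!}\left(\frac{ik_\parallel}{\mathcal{H}}\right)^L T_\parallel^L(\mathbf{k}), \tag{2.6}$$

where $T_\parallel^L(\mathbf{k})$ is the Fourier transform of $T_\parallel^L(\mathbf{x})$,

$$T_\parallel^L(\mathbf{k}) = \int d^3x\, T_\parallel^L(\mathbf{x})\, e^{i\mathbf{k}\cdot\mathbf{x}}. \tag{2.7}$$

For $L = 0$ we have $T_\parallel^0(\mathbf{k}) = \delta(\mathbf{k})$, the density fluctuation in real space.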
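
The equivalence between the direct redshift-space Fourier sum and the moment series of equation 2.6 can be checked numerically for a handful of particles. The sketch below is a one-dimensional toy with equal-mass particles, made-up positions and velocities, and units in which $\mathcal{H} = 1$ (all of these are assumptions). It compares the sum of $\exp[ik(x + u/\mathcal{H})]$ over particles with the series truncated at a chosen maximum order.

```python
import cmath
from math import factorial


def direct_mode(k, xs, us, cal_h):
    # Sum over particles of exp[i k (x + u / calH)], i.e. positions shifted to redshift space.
    return sum(cmath.exp(1j * k * (x + u / cal_h)) for x, u in zip(xs, us))


def series_mode(k, xs, us, cal_h, l_max):
    total = 0j
    for order in range(l_max + 1):
        # Unnormalised Fourier transform of the order-th mass-weighted velocity moment.
        moment_k = sum(u ** order * cmath.exp(1j * k * x) for x, u in zip(xs, us))
        total += (1j * k / cal_h) ** order / factorial(order) * moment_k
    return total


xs = [0.3, 1.7, 2.2, 5.1]      # hypothetical positions
us = [0.8, -0.5, 0.3, -0.9]    # hypothetical radial velocities
k, cal_h = 0.7, 1.0            # k |u| / calH < 1 for every particle
exact = direct_mode(k, xs, us, cal_h)
err_low = abs(series_mode(k, xs, us, cal_h, 2) - exact)
err_high = abs(series_mode(k, xs, us, cal_h, 12) - exact)
```

With $k|u|/\mathcal{H}$ below unity, the truncation error drops rapidly as more moments are included.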

2.1 Angular decomposition of moments of distribution function

The objects $T_\parallel^L(\mathbf{x})$ introduced in equation 2.5 are radial components of moments of the distribution
function, which are rank $L$ tensors,

$$T^L_{i_1,i_2,..i_L} = \frac{m_p a^{-3}}{\bar\rho}\int d^3q\, f(\mathbf{x},\mathbf{q})\, u_{i_1}u_{i_2}\ldots u_{i_L}. \tag{2.8}$$

The real-space density field corresponds to $L = 0$, i.e. zeroth moment, the $L = 1$ moment corresponds
to the momentum density, $L = 2$ gives the stress energy density tensor etc. These objects are
symmetric under exchange of any two indices and have $(L+1)(L+2)/2$ independent components.
They can be decomposed into helicity eigenstates under rotation around $\mathbf{k}$, as we do next.

Since translational symmetry guarantees that each Fourier mode is only correlated with itself, 
we can work with each Fourier mode separately, and add them appropriately in the end when we 
discuss the power spectra. By symmetry we may take k to be along z-axis. We can decompose the 
distribution function into spherical harmonics, 

$$f(\mathbf{k},q,\theta,\phi) = \sum_{l=0}^{\infty}\sum_{m=-l}^{l} f_l^m(\mathbf{k},q)\,Y_{lm}(\theta,\phi), \tag{2.9}$$

where $q$ is the amplitude of the momentum (often a term $(-i)^l$ is inserted into this expansion,
but we will drop all such terms here). The components $f_l^m(\mathbf{k},q)$ are helicity eigenmodes (i.e., eigenmodes
of the angular momentum component in the z-direction, $L_z = -i\partial/\partial\phi$) and under rotation by angle $\psi$
around the z-axis they transform as

$$f_l^m(\mathbf{k},q)' = e^{im\psi} f_l^m(\mathbf{k},q). \tag{2.10}$$

This follows from the transformation properties of spherical harmonics. A quantity which transforms 
under rotation according to this equation is said to have helicity $m$. A quantity with helicity $m = 0$ is
called a scalar, that with helicity m = ±1 is called a vector and that with m = ±2 a tensor, but the 
expansion goes to arbitrary values of m. 
Moments of the distribution function are defined in terms of integrals of velocity moments over
the distribution function. We can define helicity eigenstates of moments of the distribution function 
as 

$$T_l^{L,m}(\mathbf{k}) = \frac{4\pi m_p a^{-3}}{\bar\rho}\int q^2 dq\, u^L f_l^m(\mathbf{k},q). \tag{2.11}$$

Note that each term $T_l^{L,m}$ contains $L$ powers of velocity $u$ ($u = |\mathbf{u}|$, and recall that $q = mu$).

A general rank $L$ tensor $T^L_{i_1,..i_L}$ can be decomposed into $2L+1$ helicity eigenmodes $T_L^{L,m}$
($m = -L,..,L$) and additional components formed by a product of a scalar $u^2$ with rank $L-2$
tensors. This gives additional $2(L-2)+1$ helicity eigenmodes $T_{L-2}^{L,m}$ ($m = -(L-2)..(L-2)$), and
additional components formed again by $u^4$ and a rank $L-4$ tensor, which gives additional $2(L-4)+1$
helicity eigenmodes $T_{L-4}^{L,m}$ ($m = -(L-4)..(L-4)$) etc.

For the lowest terms we have the rank 1 tensor $T_i^1$, momentum density, which is a 3-vector in the
usual geometrical context, and can be decomposed into an $m = 0$ helicity scalar component $T_1^{1,0}$
and two $m = \pm 1$ helicity vector components $T_1^{1,\pm 1}$. A general symmetric rank 2 tensor $T_{ij}^2$ has 6
independent components. These can be decomposed into an isotropic rank 0 helicity scalar term
$T_0^{2,0} = (1+\delta)u^2$, which corresponds to the energy density, and 5 $l = 2$ components: one helicity
scalar part of the anisotropic stress tensor $T_2^{2,0}$, two helicity vector parts of the anisotropic stress tensor $T_2^{2,\pm 1}$
and two helicity tensor parts of the anisotropic stress tensor $T_2^{2,\pm 2}$. At $L = 3$ we have 10 independent
components, of which 7 are $T_3^{3,m}$, i.e. $l = 3$, $m = -3,..,3$, and 3 are $T_1^{3,m}$, $l = 1$, $m = -1,0,1$
tensors, formed by taking the isotropic $u^2$ and multiplying it with a 3-vector $u_i$, the latter of which can
be decomposed into $m = -1, 0, 1$ components.

One can show that $2L+1 + 2(L-2)+1 + 2(L-4)+1 + \ldots = (L+1)(L+2)/2$, so this decomposition
gives the required number of independent components of a general symmetric tensor of rank $L$. In
analysis of general relativity it is customary to expand the metric and stress energy tensor into scalar
($m = 0$), vector ($m = \pm 1$) and tensor ($m = \pm 2$) helicity modes (SVT decomposition). No higher
order helicity modes are needed, since only tensors of rank 0, 1 and 2 enter into the description of
the metric and energy momentum tensor. In contrast the moments of distribution function contain 
tensors of arbitrary rank and the expansion in equations 2.9,2.11 is the appropriate generalization of 
the SVT decomposition. 
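
The component counting above can be verified directly. The short Python check below sums $2l+1$ over $l = L, L-2, \ldots$ and compares the result with $(L+1)(L+2)/2$; the function names are illustrative only.

```python
def helicity_mode_count(rank):
    # Sum 2l + 1 over l = rank, rank - 2, ... down to 0 or 1.
    return sum(2 * l + 1 for l in range(rank, -1, -2))


def symmetric_tensor_components(rank):
    # Independent components of a symmetric rank-L tensor in three dimensions.
    return (rank + 1) * (rank + 2) // 2


counts = {rank: helicity_mode_count(rank) for rank in range(7)}
```

For ranks 2 and 3 this reproduces the 6 and 10 components quoted in the text.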

So far we worked in the basis defined by k pointing in z direction. In general we are interested 
in computing the components of the moments in the radial direction $\hat{r}$. If $\hat{r}$ is parallel to $\mathbf{k}$ then only
$m = 0$ components contribute, while for a general direction all of them do. The angular dependence of
the moments is obtained by performing a rotation of the basis from $z \parallel \mathbf{k}$ to $z' \parallel \hat{r}$. We can achieve this by
rotating by $\phi$ around $z$ and then by $\theta$ around the axis perpendicular to $z$, $z'$, so in terms of the general
rotation by 3 Euler angles we have $T_l^{L,m} = \sum_{m'} D^l_{m,m'}(\phi,\theta,0)\,T_l^{L,m'}$, where $D^l_{m,m'}(\phi,\theta,\phi')$ is the
general rotation matrix of spin $l$ associated with the 3 Euler angles $\phi, \theta, \phi'$ (we do not need to perform
the rotation around $z'$ by $\phi'$). Since $u_\parallel$ is invariant under rotation around $z' \parallel \hat{r}$ only $m' = 0$ survive.
The rotation matrix is given by the spherical harmonics, $D^l_{0,m}(\phi,\theta,0) = \sqrt{4\pi/(2l+1)}\,Y_{lm}(\theta,\phi)$.
Combining all together we find

$$T_\parallel^L(\mathbf{k}) = \sum_{(l=L,L-2,..)}\ \sum_{m=-l}^{l} n_l^L\, T_l^{L,m}(\mathbf{k})\,Y_{lm}(\theta,\phi), \tag{2.12}$$

where $n_l^L$ is a constant independent of angle whose numerical value will not be needed.

3 Power spectra

We will adopt a plane-parallel approximation, where only the angle between the line of sight and the 
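Fourier mode needs to be specified. The redshift-space power spectrum is defined as
$\langle\delta_s(\mathbf{k})\,\delta_s^*(\mathbf{k}')\rangle = P^{ss}(\mathbf{k})\,\delta^D(\mathbf{k}-\mathbf{k}')$. Equation 2.6 gives,

$$P^{ss}(\mathbf{k}) = \sum_{L=0}^{\infty}\sum_{L'=0}^{\infty}\frac{(-1)^{L'}}{L!\,L'!}\left(\frac{ik_\parallel}{\mathcal{H}}\right)^{L+L'} P_{LL'}(\mathbf{k}), \tag{3.1}$$
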

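where $P_{LL'}(\mathbf{k})\,\delta^D(\mathbf{k}-\mathbf{k}') = \langle T_\parallel^L(\mathbf{k})\,T_\parallel^{*L'}(\mathbf{k}')\rangle$. Note that $P_{LL'}(\mathbf{k}) = P_{L'L}(\mathbf{k})^*$ so that the total result
is real valued, as expected. Thus we only need to consider the terms $P_{LL'}(\mathbf{k})$ with $L \leq L'$, each of
which comes with a factor of 2 if $L \neq L'$ and 1 if $L = L'$. We can also write $k_\parallel/k = \cos\theta = \mu$,

$$P^{ss}(\mathbf{k}) = \sum_{L=0}^{\infty}\frac{1}{L!^2}\left(\frac{k\mu}{\mathcal{H}}\right)^{2L}P_{LL}(\mathbf{k}) + 2\sum_{L=0}^{\infty}\sum_{L'>L}\frac{(-1)^{L'}}{L!\,L'!}\left(\frac{ik\mu}{\mathcal{H}}\right)^{L+L'}P_{LL'}(\mathbf{k}). \tag{3.2}$$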

Next we want to insert the helicity decomposition of equation 2.12 and consider the implications 
of rotational symmetry on the power spectrum. Each term $P_{LL'}(\mathbf{k})$ contains products of multipole
moments

$$T_l^{L,m}\,Y_{lm}(\theta,\phi)\,\left[T_{l'}^{L',m'}\,Y_{l'm'}(\theta,\phi)\right]^* \propto e^{i(m-m')\phi}. \tag{3.3}$$

Upon averaging over the azimuthal angle $\phi$ of Fourier modes all the terms with $m \neq m'$ vanish.
Another way to state this is that upon rotation by angle $\psi$ the correlators pick up a term $e^{i(m-m')\psi}$
and in order for the power spectrum to be rotationally invariant we require $m = m'$. Putting it all
together we find

$$P_{LL'}(\mathbf{k}) = \sum_{(l=L,L-2,..)}\ \sum_{(l'=L',L'-2,..;\ l'\geq l)}\ \sum_{m=0}^{l} P_{LL'}^{l,l',m}(k)\,P_l^m(\mu)\,P_{l'}^m(\mu), \tag{3.4}$$

where $P_l^m(\mu = \cos\theta)$ are the associated Legendre polynomials, which determine the $\theta$ angular
dependence of the spherical harmonics, $Y_{lm}(\theta,\phi) = \sqrt{\frac{(2l+1)(l-m)!}{4\pi(l+m)!}}\,P_l^m(\cos\theta)\,e^{im\phi}$. We
absorbed all of the terms that depend on $l$ and $m$ and various constants into the definition of power
spectra $P_{LL'}^{l,l',m}(k)$, replaced the two helicity states $\pm m$ by a single one with $m \geq 0$, since their $\theta$
angular dependencies are the same, and we absorbed the factor of 2 into the definition of $P_{LL'}^{l,l',m}(k)$.
We also require $l' \geq l$ and absorb the factor of 2 into the definition of $P_{LL'}^{l,l',m}(k)$ since the two terms
have the same angular structure. Note that due to statistical isotropy the spectra $P_{LL'}^{l,l',m}(k)$ depend
only on the amplitude of $\mathbf{k}$, i.e. we have

$$P_{LL'}^{l,l',m}(k) \propto \left\langle T_l^{L,m}(\mathbf{k})\left(T_{l'}^{L',m}(\mathbf{k})\right)^*\right\rangle. \tag{3.5}$$

All the angular structure is thus in the associated Legendre polynomials $P_l^m(\mu)$.
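
The statement that only equal-helicity terms survive azimuthal averaging can be illustrated numerically. The following sketch averages $e^{i(m-m')\phi}$ over equally spaced angles and tabulates the result for $|m|, |m'| \leq 2$; the function name and the number of sample angles are arbitrary choices made for this illustration.

```python
import cmath
from math import pi


def azimuthal_average(m, m_prime, n_samples=64):
    # Mean of exp[i (m - m') phi] over n_samples equally spaced angles in [0, 2 pi).
    phases = (cmath.exp(1j * (m - m_prime) * 2 * pi * j / n_samples) for j in range(n_samples))
    return sum(phases) / n_samples


table = {(m, mp): abs(azimuthal_average(m, mp)) for m in range(-2, 3) for mp in range(-2, 3)}
```

The table is 1 on the diagonal $m = m'$ and vanishes (to rounding error) elsewhere, which is the selection rule used in equation 3.4.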

Equations 3.2 and 3.4 are the main result of this paper. They show that there exists a well defined
expansion in terms of cross and auto-power spectra of velocity moments. The expansion parameter is
roughly defined as $k\mu u/\mathcal{H}$, where $u$ is related to a typical gravitational velocity of the system (which
should be of the order of hundreds of km/s, but note that we take higher and higher powers of these
velocities in the series). The expansion is convergent if the expansion parameter is less than unity.

In terms of perturbation theory there is a close, but not one to one, relation between the lowest
order of perturbation theory and the order of the moment expansion. Assuming $\delta$ and $ku/\mathcal{H}$ are
of the same order, the lowest order of the contribution in terms of powers of power spectrum (i.e.,
quadratic in $\delta$) is $(L+L')/2$ if $L+L'$ is even and $L > 0$, and $(L+L'+1)/2$ if odd and $L > 0$, while
for $L = 0$ it is $L'/2+1$ if $L'$ even or $(L'+1)/2$ if $L'$ odd, but of course all higher order terms also
enter.

These equations also show that there is a close relation between the order of the moments
and their angular dependence. To understand the angular dependence we first note that the associated
Legendre polynomial $P_l^m(\mu)$ contains powers from 1 to $\mu^{l-m}$ for even $l-m$ and from $\mu$ to $\mu^{l-m}$ for odd
$l-m$, and is always multiplied with a power of $(1-\mu^2)^{m/2}$. Thus $P_{LL'}^{l,l',m}(k)$ gets multiplied with powers of
$(1-\mu^2)^m$ or $\mu(1-\mu^2)^m$ to $\mu^{l+l'-2m}(1-\mu^2)^m$, so the highest order is $\mu^{l+l'}$. In addition we have
$\mu^{L+L'}$ dependence in equation 3.2, so the lowest contribution in powers of $\mu$ to $P^{ss}(\mathbf{k})$ is $\mu^{L+L'}$ if $L+L'$ is
even or $\mu^{L+L'+1}$ if $L+L'$ is odd, and the highest is $\mu^{2(L+L')}$. Thus for $P_{00}(\mathbf{k})$ the only angular term
is isotropic, for $P_{01}(\mathbf{k})$ the only angular term is $\mu^2$, $P_{11}(\mathbf{k})$ and $P_{02}(\mathbf{k})$ contain both $\mu^2$ and $\mu^4$ etc.
Note that only even powers of $\mu$ enter in the final expression, as required by the symmetry. We now
proceed to look in more detail at the lowest order terms.

3.1 $P_{00}(\mathbf{k})$: the isotropic term

At the lowest order in the expansion we have correlation of real space density $T_0^{0,0} = \delta(\mathbf{k})$ with itself.
Density is a scalar of rank 0, $P_0^0(\mu) = 1$. The power spectrum is isotropic, $P_{00}(\mathbf{k}) = P_{00}^{0,0,0}(k)$. This
term is just the real space power spectrum and of course does not have any $\mu$ dependence since it is
independent of redshift space distortions. For small values of $\mu$ this term always dominates, and in
the limit $\mu = 0$ the transverse power spectrum becomes the real power spectrum $P_{00}(k)$. The real
space power spectrum agrees with the linear one on large scales, $P_{00}(k) = P_{\rm lin}(k)$, slightly dips below
the linear one around $k \sim 0.1\,h/{\rm Mpc}$, while on even smaller scales the nonlinear corrections cause it
to increase over the linear one.

3.2 $P_{01}(\mathbf{k})$

At the next order in our expansion (not in perturbation theory PT, see below) we have correlations
between the density $T_\parallel^0(\mathbf{k}) = \delta(\mathbf{k})$ and the radial component of momentum density $T_\parallel^1(\mathbf{k}) = [(1+\delta)u_\parallel](\mathbf{k})$.
Momentum density can be decomposed into a scalar ($m = 0$) $T_1^{1,0}$ and two vector ($m = \pm 1$)
components $T_1^{1,\pm 1}$, but only the scalar part correlates with the density $T_0^{0,0}$, which is a scalar. Thus the
only contribution comes from $P_{01}^{0,1,0}(k) \propto \langle T_0^{0,0}(\mathbf{k})\,(T_1^{1,0}(\mathbf{k}))^*\rangle$,

$$P_{01}(\mathbf{k}) = P_{01}^{0,1,0}(k)\,\mu, \tag{3.6}$$

where we used $P_1^0(\mu) = \mu$.

The scalar mode of momentum can be obtained from the divergence of momentum and related
to $\delta$ using the continuity equation, which in terms of our quantities is

$$\dot\delta + ikT_1^{1,0} = 0. \tag{3.7}$$

This is an exact relation (for conserved quantities), in the sense that the vector part of momentum
does not contribute to it, since it vanishes upon taking the divergence (i.e., vector components are
orthogonal to $\mathbf{k}$ and the dot product is zero).
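From this we get that

$$P_{01}(\mathbf{k}) = -\frac{i\mu}{k}\,\langle\delta\,\dot\delta^*\rangle = -\frac{i\mu}{2k}\frac{dP_{00}(k)}{d\tau}. \tag{3.8}$$

The total contribution from this term to $P^{ss}(\mathbf{k})$ is

$$P_{01}^{ss}(\mathbf{k}) = \mu^2\mathcal{H}^{-1}\frac{dP_{00}(k)}{d\tau} = \mu^2\frac{dP_{00}(k)}{d\ln a}. \tag{3.9}$$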

This is an exact relation for dark matter, valid also in the nonlinear regime. It shows that this term
can be obtained directly from the redshift evolution of the dark matter power spectrum $P_{00}(k)$. On
large scales it agrees with the linear theory predictions. If we write $P_{00}(k) = D(a)^2P_{\rm lin}(k)$, with $D(a)$
the linear growth factor and $f = d\ln D/d\ln a$, then we find $P_{01}^{ss}(\mathbf{k}) = 2f\mu^2P_{\rm lin}(k)$. We thus see that
this term is of the same order in PT as the first order $P_{00}(k)$, the well known Kaiser result [2]. On
smaller scales we expect the term to deviate from the linear one, just as for $P_{00}(k)$.
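
The step from the logarithmic time derivative of $P_{00}$ to the Kaiser coefficient can be written out explicitly. This is a sketch that assumes the scale-independent linear growth $P_{00}(k) = D(a)^2P_{\rm lin}(k)$ stated above, with $D$ normalized to unity at the epoch of interest (a normalization convention assumed here so that $P_{00} = P_{\rm lin}$ at that epoch):

```latex
\mu^2\frac{dP_{00}(k)}{d\ln a}
  = \mu^2\,\frac{d\ln D^2}{d\ln a}\,D^2 P_{\rm lin}(k)
  = 2f\mu^2 D^2 P_{\rm lin}(k)
  \;\longrightarrow\; 2f\mu^2 P_{\rm lin}(k) \quad (D = 1).
```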

3.3 $P_{11}(\mathbf{k})$

The next term is the correlation of the momentum density $T_\parallel^1(\mathbf{k})$ with itself. In this case the scalar
($m = 0$) $T_1^{1,0}(\mathbf{k})$ correlates with itself and the vector ($m = \pm 1$) components $T_1^{1,\pm 1}(\mathbf{k})$ also correlate
with themselves, so both components of momentum contribute,

$$P_{11}(\mathbf{k}) = P_{11}^{1,1,0}(k)\,[P_1^0(\mu)]^2 + P_{11}^{1,1,1}(k)\,[P_1^1(\mu)]^2. \tag{3.10}$$

In terms of the contribution to the redshift space power spectrum this gives

$$P_{11}^{ss}(\mathbf{k}) = \mathcal{H}^{-2}k^2\mu^2\left[P_{11}^{1,1,0}(k)\,\mu^2 + P_{11}^{1,1,1}(k)\,(1-\mu^2)\right]. \tag{3.11}$$
The scalar part of the momentum is the one that contributes to the continuity equation 3.7. In linear
perturbation theory only the scalar contribution is non-zero and $P_{11}^{1,1,0}(k) = f^2\mathcal{H}^2k^{-2}P_{\rm lin}(k)$. This term is
also of linear order and collecting all terms at this order we obtain the usual expression [2] 

$$P^{ss}_{\rm lin}(\mathbf{k}) = (1+f\mu^2)^2P_{\rm lin}(k). \tag{3.12}$$

However, we see from the above that there will be another contribution to both $\mu^2$ and $\mu^4$ terms
from the vector part of momentum correlator $P_{11}^{1,1,1}(k) \propto \langle|T_1^{1,1}(\mathbf{k})|^2\rangle$, which comes in at the second
order in power spectrum. This vector part is often called the vorticity part of the momentum. In
general this term is non-zero because vorticity of momentum does not vanish, even if vorticity of
velocity vanishes for a single streamed fluid [16]. As seen from equation 3.11 this term always adds
power to the $\mu^2$ term and subtracts it in the $\mu^4$ term (but is combined with a positive contribution from the
scalar part in the $\mu^4$ term).
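
To make the bookkeeping at linear order concrete, the sketch below adds the linear-theory contributions of $P_{00}$, $P_{01}$ and $P_{11}$ quoted in the text ($P_{\rm lin}$, $2f\mu^2P_{\rm lin}$ and $f^2\mu^4P_{\rm lin}$) and compares the sum with the Kaiser expression. It also exposes the angular factor $\mu^2(1-\mu^2)$ that multiplies the vector part of the momentum correlator. Function names and numerical inputs are hypothetical.

```python
def linear_moment_terms(p_lin, f, mu):
    """Linear-theory contributions of P00, P01 and P11 to the redshift-space power."""
    return {
        "P00": p_lin,
        "P01": 2.0 * f * mu ** 2 * p_lin,
        "P11": f ** 2 * mu ** 4 * p_lin,
    }


def kaiser(p_lin, f, mu):
    return (1.0 + f * mu ** 2) ** 2 * p_lin


def vector_part_angular(mu):
    # Angular factor mu^2 (1 - mu^2) multiplying the vector (m = +-1) momentum correlator:
    # positive in the mu^2 coefficient, negative in the mu^4 coefficient.
    return mu ** 2 * (1.0 - mu ** 2)


terms = linear_moment_terms(p_lin=1000.0, f=0.5, mu=0.6)  # arbitrary illustrative values
total = sum(terms.values())
```

The vector-part factor is non-negative for all $\mu$, so that correlator can only add power overall, in line with the discussion above.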

3.4 $P_{02}(\mathbf{k})$

At orders higher than $P_{11}(\mathbf{k})$ we no longer have any linear contributions, hence these terms are usually
not of interest for extracting the cosmological information. However, these terms, including what is
sometimes called the Fingers-of-God (FoG) effect, are known to be important on fairly large scales.
Here we will limit the discussion to some general statements of their $k$ and $\mu$ dependence, leaving
their more precise calculations to future work.

There are two different terms that contribute to this term,

$$P_{02}(\mathbf{k}) = P_{02}^{0,0,0}(k)\,[P_0^0(\mu)]^2 + P_{02}^{0,2,0}(k)\,P_0^0(\mu)\,P_2^0(\mu). \tag{3.13}$$

In terms of the contribution to the redshift space power spectrum this gives

$$P_{02}^{ss}(\mathbf{k}) = -\left(\frac{k\mu}{\mathcal{H}}\right)^2\left[P_{02}^{0,0,0}(k) + \frac{1}{2}P_{02}^{0,2,0}(k)\,(3\mu^2-1)\right]. \tag{3.14}$$

The first term is the correlation between the isotropic part of the mass weighted square of velocity,
i.e. the energy density, $T_0^{2,0} = (1+\delta)u^2$ and the density field $T_0^{0,0} = \delta$. The second term comes from
the scalar part of the anisotropic stress $T_2^{2,0}$ correlated with the density $T_0^{0,0} = \delta$.

On physical grounds we expect the first term to be large in systems with a large rms velocity,
resulting in a term scaling as $P_{02}^{0,0,0}(k) \sim P_{00}(k)\,\sigma^2$, where $\sigma^2$ has units of velocity squared, but is
not simply the volume averaged velocity squared (see below). The contribution of this term to $P^{ss}$
goes as $-(k\mu/\mathcal{H})^2\sigma^2P_{00}(k)$, i.e. it is a damping term suppressing the linear power spectrum, with
the effect increasing towards higher $k$ (smaller scales). This is the lowest order FoG term, which we
see contributes as $(k\mu)^2$ dependence and so affects the $\mu^2$ term. It is a damping term that is always
negative, while the corresponding $\mu^2$ term from $P_{11}$ always adds power. The scalar anisotropic stress-density
correlator $P_{02}^{0,2,0}(k)$ also contributes to the $\mu^2$ angular term, as well as to the $\mu^4$ angular term, and is
formally of the same order in perturbation theory as $P_{02}^{0,0,0}(k)$, but is likely to be smaller on the physical
grounds that velocity dispersion in virialized objects is isotropic and hence has a small anisotropic
stress.
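
The split of the $P_{02}$ contribution into $\mu^2$ and $\mu^4$ pieces follows from expanding $-(k\mu/\mathcal{H})^2[P_{02}^{0,0,0} + \tfrac{1}{2}P_{02}^{0,2,0}(3\mu^2-1)]$. The sketch below does this expansion; `p_iso` and `p_aniso` stand for $P_{02}^{0,0,0}(k)$ and $P_{02}^{0,2,0}(k)$, and all numerical values (including $\mathcal{H} = 1$) are arbitrary assumptions used only to exercise the formula.

```python
def p02_mu_coefficients(k, cal_h, p_iso, p_aniso):
    """Coefficients of mu^2 and mu^4 in -(k mu / calH)^2 [p_iso + p_aniso (3 mu^2 - 1) / 2]."""
    prefactor = -(k / cal_h) ** 2
    return {2: prefactor * (p_iso - 0.5 * p_aniso), 4: prefactor * 1.5 * p_aniso}


def p02_contribution(k, mu, cal_h, p_iso, p_aniso):
    # Direct evaluation of the same expression at a given mu.
    return -(k * mu / cal_h) ** 2 * (p_iso + 0.5 * p_aniso * (3.0 * mu ** 2 - 1.0))


coeffs = p02_mu_coefficients(k=0.2, cal_h=1.0, p_iso=50.0, p_aniso=5.0)
direct = p02_contribution(k=0.2, mu=0.7, cal_h=1.0, p_iso=50.0, p_aniso=5.0)
from_coeffs = coeffs[2] * 0.7 ** 2 + coeffs[4] * 0.7 ** 4
```

When the isotropic term dominates, the $\mu^2$ coefficient is negative, which is the damping behaviour described above.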



3.5 $P_{03}(\mathbf{k})$, $P_{12}(\mathbf{k})$, $P_{04}(\mathbf{k})$, $P_{13}(\mathbf{k})$ and $P_{22}(\mathbf{k})$

Since the lowest order in $\mu$ is $L+L'$ or $L+L'+1$, the terms of order higher than $P_{02}(\mathbf{k})$ do not
contribute to the $\mu^2$ term. At the next order in $\mu$ we have terms of order $\mu^4$. At this order there are
7 terms that contribute. $P_{11}(\mathbf{k})$ and $P_{02}(\mathbf{k})$ we already discussed: while $P_{11}(\mathbf{k})$ has a linear order
term and is expected to dominate on large scales, $P_{02}(\mathbf{k})$ is second order in power spectrum. They
both come with a prefactor of $k^2$. At the next order we have $P_{03}(\mathbf{k})$ and $P_{12}(\mathbf{k})$, both second order
in power spectrum and each multiplied by $k^3$, followed by $P_{13}(\mathbf{k})$ and $P_{22}(\mathbf{k})$, also second order in
power spectrum, but each multiplied by $k^4$, and by $P_{04}(\mathbf{k})$, third order in power spectrum. All of
these terms also contribute to terms of higher order in $\mu^{2j}$, up to $\mu^6$ or $\mu^8$ for these terms.
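
The counting of terms at each power of $\mu$ can be reproduced from the rules stated in the text: for a pair $(L, L')$ with $L \leq L'$, the lowest power of $\mu$ is $L+L'$ (even) or $L+L'+1$ (odd), and the highest is $2(L+L')$. The sketch below enumerates the pairs contributing to a given power and also encodes the lowest perturbative order quoted earlier. The function names and the cutoff `max_rank` are illustrative choices.

```python
def mu_power_range(big_l, big_l_prime):
    total = big_l + big_l_prime
    lowest = total if total % 2 == 0 else total + 1
    return lowest, 2 * total


def lowest_pt_order(big_l, big_l_prime):
    # Lowest order in powers of the power spectrum, following the rule in the text (L <= L').
    total = big_l + big_l_prime
    if big_l == 0:
        return big_l_prime // 2 + 1 if big_l_prime % 2 == 0 else (big_l_prime + 1) // 2
    return total // 2 if total % 2 == 0 else (total + 1) // 2


def terms_contributing(mu_power, max_rank=8):
    found = []
    for big_l in range(max_rank + 1):
        for big_l_prime in range(big_l, max_rank + 1):
            lowest, highest = mu_power_range(big_l, big_l_prime)
            if lowest <= mu_power <= highest:
                found.append((big_l, big_l_prime))
    return found


mu2_terms = terms_contributing(2)
mu4_terms = terms_contributing(4)
```

Under these rules `mu4_terms` contains the seven pairs named in this section, and `mu2_terms` contains only $P_{01}$, $P_{02}$ and $P_{11}$.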

One can see from this discussion that the angular structure of higher order terms is considerably
more complex than that of lower order terms and that all the terms even in powers of $\mu$ are being
generated by RSD. However, there is a connection between the angular order, powers of $k$ and lowest
order in perturbation theory, such that only the low powers of $k$ and low lowest order of PT contribute
to the lowest orders in $\mu$. Thus, at low values of $k\mu$, the series is convergent. To make these statements
more quantitative a numerical or perturbative analysis is required, which will be presented elsewhere
[17].

3.6 Shot noise and connections between the correlators 

The correlators at the same order in powers of velocity, i.e. equal $L+L'$, contain nontrivial
cancellations among them. To see this assume velocity is constant over a region of space $r \sim k_0^{-1}$. For
example, large scale bulk flows lead to correlated velocities on small scales, giving rise to nearly equal
velocities between nearby particles. On scales smaller than this velocity coherence scale, $k > k_0$, we
can pull out these constant velocity terms from the correlators, to obtain

$$P_{LL'}(k)\,\delta^D(\mathbf{k}-\mathbf{k}') = \left\langle[(1+\delta)u_\parallel^L](\mathbf{k})\,[(1+\delta)u_\parallel^{L'}]^*(\mathbf{k}')\right\rangle \sim P_{00}(k)\,\langle u_\parallel^{L+L'}\rangle\,\delta^D(\mathbf{k}-\mathbf{k}'), \tag{3.15}$$

where $\langle u_\parallel^{L+L'}\rangle$ is just a number corresponding to the spatial average of this term. So these terms are
all equal as long as $L+L'$ is the same.

These terms enter into the sum in equation 3.1 with different prefactors and opposite signs,
leading to cancellations between them. The lowest order example is that of $P_{11}$ and $P_{02}$, which enter
with equal prefactors but opposite signs, canceling any such contributions from each other. This is
not surprising: bulk flows lead to "rigid body" displacements of particles but do not contribute to
FoG effects, so their contribution to $P_{02}$ must be canceled. As a result, only velocity dispersion type
contributions lead to FoG effects.

In the extreme case this argument can be applied to the shot noise for these correlators, which
is the contribution to the power spectrum caused by discreteness of tracers. It is well known that
the shot noise of a density field sampled by tracers of number density $n$ is given by $P_{00}(k) = n^{-1}$.
An analogous calculation for the moments gives

$$P_{LL'}(k) = n^{-1}\langle u_\parallel^{L+L'}\rangle. \tag{3.16}$$

This expression is exact, since by definition a discrete tracer population only has a single value of
velocity at any given position, so $\langle u_\parallel^{L+L'}\rangle$ will be the same for any pair of $L, L'$ such that $L+L'$ is
the same. These shot noise terms can be large if the tracer is sparse, i.e. if $n$ is small. However, the
argument above shows that these terms enter with opposite signs in the final result and so these shot
noise contributions cancel in the total sum of 3.1. This is expected: the only shot noise contribution
to the total RSD power spectrum $P^{ss}(k)$ should be $n^{-1}$. These examples show that these velocity
moments are connected, and it is more natural to consider them together, such as $P_{11}(k) - P_{02}(k)$,
where the shot noise and the bulk flow terms cancel out.
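
The cancellation can be checked by summing the numerical prefactors of the double sum over $L$ and $L'$. The sketch below assumes the coefficient of $(k\mu/\mathcal{H})^{L+L'}P_{LL'}$ is $(-1)^{L'}\,i^{L+L'}/(L!\,L'!)$, which is what follows from multiplying the series for $\delta_s$ by its complex conjugate. If every $P_{LL'}$ with the same $L+L'$ takes the same value, as for shot noise or a coherent bulk flow, the net contribution at that order is proportional to the summed prefactor.

```python
from math import factorial


def prefactor(big_l, big_l_prime):
    # Coefficient multiplying (k mu / calH)^(L + L') P_LL' in the double sum over L and L'.
    sign = (-1) ** big_l_prime
    return sign * 1j ** (big_l + big_l_prime) / (factorial(big_l) * factorial(big_l_prime))


def summed_prefactor(total):
    # Sum over all pairs (L, L') with L + L' = total.
    return sum(prefactor(big_l, total - big_l) for big_l in range(total + 1))


sums = [summed_prefactor(total) for total in range(7)]
p11_plus_p02_pair = prefactor(1, 1) + prefactor(0, 2) + prefactor(2, 0)
```

Only the $L+L' = 0$ entry of `sums` is non-zero, so a constant shot noise survives only through $P_{00}$, and the $P_{11}$ prefactor is exactly balanced by the two $P_{02}$-type terms.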

3.7 Relation to Legendre moments 

In RSD analyses it is customary to integrate $P^{ss}(\mathbf{k})$ over the lowest order Legendre polynomials to
obtain moments $P_l^{ss}(k)$,

$$P_l^{ss}(k) = (2l+1)\int_0^1 P^{ss}(\mathbf{k})\,P_l(\mu)\,d\mu, \tag{3.17}$$

where $P_l(\mu)$ are the ordinary Legendre polynomials, $P_0(\mu) = 1$, $P_2(\mu) = (3\mu^2-1)/2$ and $P_4(\mu) =
(35\mu^4 - 30\mu^2 + 3)/8$. Only the lowest 3 orders contain contributions from linear terms, so the analysis
is usually limited to $l = 0, 2, 4$. The $l = 0$ moment is just the spherical average of the power spectrum
in redshift space. The advantage of this expansion is that in a typical survey the moments are
uncorrelated on scales small compared to the size of the survey.

Moments in even $l$ can be viewed as an alternative way to expand in terms of even powers of
$\mu$. However, the expansion given in equation 3.4 is not an expansion in Legendre polynomials, since
it contains products of associated Legendre polynomials (including squares of ordinary Legendre
polynomials of both even and odd orders). Hence there is no orthogonality between the moments
of distribution function and Legendre moments $P_l^{ss}(k)$. So if we expand the angular dependence of
any given term $P_{LL'}^{ss}(\mathbf{k})$ into Legendre polynomials, we will generate all orders up to $l = 2(L+L')$.
This means for example that all terms will contribute to the monopole $l = 0$, all except $P_{00}^{ss}(k)$ to
quadrupole $l = 2$ and all but $P_{00}^{ss}(k)$ and $P_{01}^{ss}(k)$ to hexadecapole $l = 4$. As a result we always have an
infinite number of terms $P_{LL'}(\mathbf{k})$ contributing to any given Legendre series term $P_l^{ss}(k)$, with higher
and higher powers of $k$. This expansion is thus considerably more complex than the expansion in
powers of $\mu^{2j}$, which has a finite number of terms for any given value of $j$.
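
The mixing between powers of $\mu$ and Legendre moments can be made explicit by projecting $\mu^{2j}$ onto $l = 0, 2, 4$ with the $(2l+1)\int_0^1 d\mu$ convention used above. The sketch below does this exactly with rational arithmetic; the helper names are illustrative, and the Kaiser polynomial $1 + 2f\mu^2 + f^2\mu^4$ is used as a familiar test case.

```python
from fractions import Fraction

# Coefficients of mu^0, mu^2, mu^4 in the Legendre polynomials P_0, P_2, P_4.
LEGENDRE = {
    0: {0: Fraction(1)},
    2: {0: Fraction(-1, 2), 2: Fraction(3, 2)},
    4: {0: Fraction(3, 8), 2: Fraction(-30, 8), 4: Fraction(35, 8)},
}


def multipole_of_mu_power(ell, power):
    """(2 ell + 1) * integral_0^1 mu^power P_ell(mu) dmu, evaluated exactly."""
    integral = sum(c * Fraction(1, power + n + 1) for n, c in LEGENDRE[ell].items())
    return (2 * ell + 1) * integral


def kaiser_multipoles(f):
    # Angular polynomial 1 + 2 f mu^2 + f^2 mu^4 projected onto ell = 0, 2, 4.
    coeffs = {0: Fraction(1), 2: 2 * f, 4: f * f}
    return {ell: sum(c * multipole_of_mu_power(ell, p) for p, c in coeffs.items())
            for ell in LEGENDRE}


table = {p: [multipole_of_mu_power(ell, p) for ell in (0, 2, 4)] for p in (0, 2, 4, 6)}
```

A $\mu^{2j}$ term feeds every multipole with $l \leq 2j$; in particular the $\mu^6$ row of `table` is non-zero for all three multipoles, which is why higher order terms contaminate the monopole, quadrupole and hexadecapole alike.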

The discussion above suggests it may be more beneficial to fit for powers of $\mu^{2j}$ rather than for
Legendre moments, for example by fitting for the $\mu^0$, $\mu^2$ and $\mu^4$ terms, which contain linear order
contributions, together with higher order terms $\mu^6$, $\mu^8$ etc., which we do not care for and can
marginalize over in the end. However, Legendre moments are uncorrelated while powers of $\mu^{2j}$ are
strongly correlated, so a marginalization over higher order terms will lead to a large increase in errors
at high $k$, given that these terms become very large at high $k$ compared to the lower order terms. So this
can only work if sufficiently strong priors are adopted for the higher order terms $\mu^6$, etc. Such priors
could come from simulations extracting individual higher order terms or from a parametrized model. This
is pursued further in [17].
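
A toy Fisher-matrix calculation can make the trade-off concrete. The sketch below is a deliberately simplified illustration under assumptions that are not in the text: the angle cosine is sampled uniformly on [0, 1], the noise is equal and uncorrelated at every angle, and all scale dependence is ignored. The function names and the prior width of 0.1 are hypothetical.

```python
import numpy as np


def fisher_mu_powers(n_terms):
    """Toy Fisher matrix for the coefficients of mu^0, mu^2, mu^4, ...

    Assumes uniform coverage of mu in [0, 1] with unit uncorrelated noise,
    so F_ij = integral of mu^(2i) * mu^(2j) d(mu) = 1 / (2i + 2j + 1).
    """
    idx = np.arange(n_terms)
    return 1.0 / (2 * idx[:, None] + 2 * idx[None, :] + 1)


def covariance(fisher, prior_sigma=None):
    """Parameter covariance, optionally adding independent Gaussian priors."""
    total = fisher.copy()
    if prior_sigma is not None:
        # An infinite prior width adds nothing to the Fisher matrix.
        total = total + np.diag(1.0 / np.asarray(prior_sigma, dtype=float) ** 2)
    return np.linalg.inv(total)


def marginalized_errors(fisher, prior_sigma=None):
    return np.sqrt(np.diag(covariance(fisher, prior_sigma)))


# Three terms only, five terms with no prior, five terms with priors on mu^6, mu^8.
errors_3 = marginalized_errors(fisher_mu_powers(3))
errors_5 = marginalized_errors(fisher_mu_powers(5))
prior = [np.inf, np.inf, np.inf, 0.1, 0.1]
errors_5_prior = marginalized_errors(fisher_mu_powers(5), prior_sigma=prior)

# Correlation coefficient between the mu^0 and mu^2 coefficients (three-term fit).
cov_3 = covariance(fisher_mu_powers(3))
corr_01 = cov_3[0, 1] / np.sqrt(cov_3[0, 0] * cov_3[1, 1])

print("errors, 3 terms:          ", errors_3)
print("errors, 5 terms, no prior:", errors_5[:3])
print("errors, 5 terms, priors:  ", errors_5_prior[:3])
print("corr(mu^0, mu^2):         ", corr_01)
```

In this toy setting the errors on the three low order coefficients grow once the two extra terms are marginalized without priors, and the priors pull them back towards the three-term values.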

3.8 Applications to galaxies and issues of bias 

The relation to other tracers such as galaxies is a rich subject worth exploring further with this method. 
In this paper we focus primarily on the dark matter, but all the derivations remain unchanged if the 
dark matter particles are replaced with some other tracers, such as galaxies or halos. In large scale 
structure we usually define bias as the ratio of the galaxy power spectrum (shot noise subtracted) to the
matter power spectrum, $b^2(k) = P_{00}^{gg}(k)/P_{00}^{\rm dm}(k)$. We can generalize the concept of bias to

$$b_{LL'}(\mathbf{k}) = \frac{P_{LL'}^{gg}(\mathbf{k})}{P_{LL'}^{\rm dm}(\mathbf{k})}, \qquad (3.18)$$

where $P_{LL'}^{gg}(\mathbf{k})$ is the galaxy correlator and $P_{LL'}^{\rm dm}(\mathbf{k})$ is the corresponding dark matter term. In linear
theory we have $b_{00} = b_1^2$, $b_{01} = b_1$ and $b_{11} = 1$, independent of scale or angle, where linear bias
is defined as $\delta_g = b_1\delta_m$. Two ways to extract cosmological information from RSD are either
combining $P_{00}$ and $P_{01}$ to eliminate $b_1$, or using $P_{11}$ directly.
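
As a sketch of how the first route could work, assume the linear theory bias values quoted above hold exactly. The particular ratio below is chosen for illustration and is not necessarily the combination used in practice; the superscripts gg and dm label galaxy and dark matter correlators.

```latex
P_{00}^{gg} = b_1^2\, P_{00}^{\rm dm}, \qquad
P_{01}^{gg} = b_1\, P_{01}^{\rm dm}
\quad\Longrightarrow\quad
\frac{\left(P_{01}^{gg}\right)^2}{P_{00}^{gg}}
  = \frac{\left(P_{01}^{\rm dm}\right)^2}{P_{00}^{\rm dm}} .
```

The right-hand side contains only dark matter quantities, so the linear bias drops out of this combination.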

Before discussing RSD further it is useful to draw a comparison to weak lensing. In case of weak 
lensing we can measure both projected dark matter density or galaxy density, so we can perform a 
joint correlation analysis of galaxy clustering and weak lensing, where the galaxy auto-correlation is 
proportional to $b^2$ times matter correlations, cross-correlation between galaxies and weak lensing signal
around them induced by the dark matter is proportional to b (the so called galaxy-galaxy lensing), 
while the weak lensing auto-correlation is independent of bias. Two ways to extract the signal are 
either using just shear-shear correlations tracing matter-matter correlations, or combining galaxy 
auto-correlation with galaxy-galaxy lensing to eliminate bias. This latter has higher signal to noise 
but is complicated by the fact that bias is scale dependent and the scale dependence depends
on the galaxy properties [18]. To understand when this happens it is useful to expand the galaxy density
perturbation to second order in matter density, $\delta_g = b_1\delta_m + b_2\delta_m^2$. The second order terms will become
important when they cannot be neglected against the first order terms, so the expansion parameter
is $(b_2/b_1)\delta_m$. Since $\delta_m^{\rm rms}$ increases on small scales this scale dependent bias increases towards small
scales. Typically we have $|b_2/b_1| < 0.4$ [18] and the corrections become important at $k \sim 0.1h$/Mpc,
where $\delta_m^{\rm rms} \sim 0.5$.

Returning to RSD, our formalism is directly applicable to galaxies, except that all the
velocity moments are mass weighted for the dark matter, $T_\parallel^{L,{\rm dm}} = (1 + \delta_m)u_\parallel^L$, and number density
weighted for the galaxies, $T_\parallel^{L,g} = (1 + \delta_g)u_\parallel^L$. What this shows is that if the density distribution
of galaxies differs from that of the dark matter then all the correlators of velocity moments will
differ from each other, even those that appear independent of bias, such as $P_{11}$. In reality thus the
predictions of the linear bias model will be modified, because even if galaxies are faithful tracers of the
dark matter velocities at a given position, the weighting of the velocity moments differs: in one case
they are weighted by the dark matter mass, in the other by the number of galaxies, and the two differ
in their spatial distribution. This will result in scale dependence of the higher order bias terms $b_{LL'}$,
just like it does for $b_{00}$ itself [19].

To quantify this further, for the lowest order momentum density term and for linear bias we must
compare the correlation of $(1 + b_1\delta_m)u_\parallel$ with itself to give $P_{11}^{gg}(\mathbf{k})$, or with $b_1\delta_m$ to give $P_{01}^{gg}(\mathbf{k})$.
The auto-correlation will give a result that agrees with the dark matter only for $b_1 = 1$, or if
$b_1\delta_m \ll 1$. In the same limit the cross-correlation will give linear bias $b_1$. So momentum density
becomes velocity in the limit $b_1\delta_m \ll 1$, while requiring it to be scale independent relative to dark
matter requires something like $(b_1 - 1)\delta_m \ll 1$, which for typical LRG galaxies ($b_1 \sim 2$) is in fact a
more stringent requirement than the scale independent bias condition discussed above ($b_2$ versus $b_1$).
This suggests that the scale dependence of the momentum density bias terms $b_{01}$ and $b_{11}$ defined in
equation 3.18 extends to larger scales than the scale dependent bias of density $b_{00}$.
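
To put rough numbers on this comparison, one can insert the values quoted in the text: a density fluctuation of about 0.5 at $k \sim 0.1h$/Mpc, a second order bias ratio below 0.4, and a linear bias of about 2 for LRGs. This is an order-of-magnitude illustration only.

```latex
\left|\frac{b_2}{b_1}\right| \delta_m \lesssim 0.4 \times 0.5 = 0.2,
\qquad
(b_1 - 1)\,\delta_m \sim (2 - 1) \times 0.5 = 0.5 .
```

With these inputs the momentum density condition is violated by a factor of a few more than the density bias condition at the same scale, so it has to be imposed at larger scales (lower $k$) to reach the same level of accuracy.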

The conclusion from this discussion is that the scale dependence of bias terms involving momentum
density is a real concern in RSD and likely extends to relatively large scales ($k < 0.1h$/Mpc). In
terms of the angular decomposition in powers of $\mu$, the discussion of the scale dependent bias of RSD
can be divided into the $\mu^2$ term, which depends entirely on $b_{01}$ in $P_{01}$, and the $\mu^4$ term, for which the
scale dependence of the $b_{11}$ term above is applicable, since that is the term that does not vanish on large
scales in linear theory. In this sense RSD analysis is not the equivalent of a joint galaxy-weak lensing
analysis, since the weak lensing auto-correlation truly traces the dark matter directly, while in RSD this
limit is achieved only on relatively large scales where $\delta_m^{\rm rms} \ll 1$.

The discussion so far completely ignored FoG effects: for the $\mu^2$ term these are encoded in $P_{02}$ and
in the vector part of $P_{11}$, which unlike $P_{02}$ adds power rather than removes it, and these terms have their
own physical interpretation and scale dependence unrelated to the scale dependent bias discussion
above. While they can partially cancel the effects discussed above they are unlikely to achieve this
exactly. In most of the literature so far only the scale dependence induced by FoG effects was discussed
(although see [7, 12]). The simple linear bias model predicts that FoG effects scale with bias squared:
the leading order term scales as $b_1^2$ both in $P_{02}^{gg} \propto \langle [b_1\delta v_\parallel^2](\mathbf{k})\, b_1\delta(-\mathbf{k}) \rangle$ and in $P_{11}^{gg} \propto [b_1\delta v_\parallel]^2$. If we
write $P_{02}^{gg}(k) - P_{11}^{gg}(k) = P_{00}^{gg}(k)\sigma^2$, then $\sigma$ is independent of bias, since $P_{00}^{gg}(k) \propto b_1^2 P_{00}^{\rm dm}(k)$.

4 Discussion 

In this paper we present a distribution function approach to redshift space distortions. We show that 
the redshift space density can be expressed in terms of a sum over velocity moments and the redshift 
space power spectrum can be expressed in terms of correlators between the Fourier components of 
these moments. These moments are simple objects to calculate in any system: they are calculated by 
simply taking appropriate powers of radial velocity and summing over all particles. The lowest order 
moments are density, momentum density, stress energy density etc. 
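
The recipe in the preceding paragraph can be sketched for a particle snapshot as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: particles live in a cubic box, are assigned to cells with the nearest-grid-point scheme, and the moments are normalized by the mean particle count per cell so that the zeroth moment is one plus the overdensity. The function name, box size, grid size and random test particles are all hypothetical.

```python
import numpy as np


def velocity_moment_grid(positions, u_parallel, order, box_size, n_cells):
    """Density-weighted moment T^L = (1 + delta) * u_parallel^L on a grid.

    Each particle adds u_parallel**order to its cell (nearest grid point);
    the sum is divided by the mean number of particles per cell.
    """
    grid, _ = np.histogramdd(
        positions,
        bins=(n_cells,) * 3,
        range=[(0.0, box_size)] * 3,
        weights=u_parallel ** order,
    )
    mean_count = len(positions) / n_cells ** 3
    return grid / mean_count


# Hypothetical test particles: uniform positions, Gaussian line-of-sight velocities.
rng = np.random.default_rng(0)
box_size, n_cells, n_particles = 100.0, 4, 1000
positions = rng.uniform(0.0, box_size, size=(n_particles, 3))
u_parallel = rng.normal(0.0, 1.0, size=n_particles)

density = velocity_moment_grid(positions, u_parallel, 0, box_size, n_cells)
momentum = velocity_moment_grid(positions, u_parallel, 1, box_size, n_cells)
energy = velocity_moment_grid(positions, u_parallel, 2, box_size, n_cells)

print("mean of (1 + delta):", density.mean())
print("mean momentum density:", momentum.mean())
print("mean energy density:", energy.mean())
```

A cell containing no particles gets zero for every moment, so no velocity needs to be defined in empty regions. The correlators would then follow from Fourier transforming these grids (for example with `np.fft.fftn`) and averaging products of the modes.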

We have decomposed the moments into helicity eigenstates based on their transformation prop- 
erties under rotation around the direction of the Fourier mode, a generalization of SVT decomposition 
in cosmological perturbation theory. We use rotational invariance to derive all of the allowed corre- 
lator terms, showing that only terms with the same helicity can contribute to the correlators. The 
moments of distribution function are complicated objects with many terms allowed by symmetries, 
especially at higher order, leading to a complicated angular and scale dependence, suggesting that
treatments of RSD cannot be fully successful with simple ansatzes, such as the popular FoG velocity 
dispersion model with one free parameter [10, 11]. 

Despite the complexity of the general RSD description some general statements can be made.
The lower order terms generally only contribute to low orders of expansion in $\mu^2$, where $\mu$ is the
cosine of the angle between the Fourier mode and the line of sight. As an example, we have shown that
only the scalar part of the momentum density correlates with the density, and this term can be written
in terms of a time derivative of the power spectrum. This term only contributes to the $\mu^2$ angular
dependence and contains a linear order term. But there is also the vector part of the momentum
density-momentum density correlation, the (scalar) energy density-density correlation, and the scalar
part of the anisotropic stress density-density correlation, all of which also contribute to the $\mu^2$ term.
They are all nonlinear and cannot dominate on very large scales, but likely dominate on small scales.
The energy density-density correlation term is the term most closely related to the FoG velocity
dispersion effect and is always negative, suppressing the power, but the other terms are formally of the
same order in perturbation theory. We have shown that the vorticity part of momentum always adds to
the RSD power of the $\mu^2$ term, and hence acts in the opposite direction to the FoG term. Our analysis
cannot address which term has a larger amplitude, but it would be interesting to see if there are any
systems where the terms that add power dominate over those that suppress it. The next angular term has
$\mu^4$ dependence and we identified 7 terms that contribute to it, of which one, the scalar part of $P_{11}$,
contains a linear contribution that does not vanish on large scales.

The fact that there are a finite number of velocity moment terms at each order of expansion
should be contrasted with the popular Legendre multipole expansion (in which the monopole, quadrupole
and hexadecapole contain cosmological information), where each multipole receives contributions from
all orders in moments of the distribution function. This suggests that a better behaved analysis may be
possible if instead of a multipole analysis the analysis is performed in terms of a $\mu^{2j}$ expansion, with
the lowest 3 orders containing cosmological information and the rest treated as nuisance parameters.

It is important to emphasize that these moments are mass weighted quantities, and no volume 
averaged quantities ever enter into our expressions. This relates to one of the long standing issues 
in the treatment of RSD: many of the past treatments [10, 11, 20] have assumed that RSD trace 
correlations between velocities and dark matter and that the FoG effects multiply these density- 
velocity and velocity-velocity correlations, where FoG quantities are also defined as volume weighted 
quantities such as velocity dispersion $\sigma^2 = \langle u^2 \rangle$. But these volume weighted quantities are not well
defined, especially for sparse biased systems such as galaxies or clusters. For a biased tracer with $b > 1$
one finds that voids with no tracers in them are enlarged, since, for $\delta < 0$, $1 + b\delta$ is closer to zero than
$1 + \delta$. This has forced some workers to use the dark matter velocity field instead, with unpredictable
results [11]. Our expansion shows that it is more natural to define RSD in terms of mass or number 
weighted quantities, such as momentum density or energy density, the former replacing velocity and 
the latter replacing velocity dispersion. Mass and number weighted moments such as momentum or 
energy density are well defined even in voids (where they are simply zero). In this paper we show that
there is a consistent expansion using mass weighted moments, and that the expansion is convergent 
on large scales. 

The fact that all RSD quantities are density weighted also suggests that RSD effects will differ 
if the galaxy number density distribution differs from mass density distribution. We have shown that 
even a linear bias model induces scale dependent bias of the momentum density correlators, and that 
this scale dependence is likely to show up on relatively large scales, $k < 0.1h$/Mpc. The success of
RSD in extracting cosmological information depends entirely on our ability to model these various 
bias terms and relate them to each other. Similarly, the success of the approach presented here in 
modeling RSD depends on our ability to extract these moments from simulations and data and on 
our ability to model them with analytic models, such as perturbation theory. Providing physical 
interpretation of the terms, as done here, could enable one to develop more effective modeling, or 
provide a better physical understanding of limitations of RSD in extracting cosmological information. 
For example, it is relatively straightforward to include the bias induced scale dependence effect at
the lowest order of PT and we will present the results elsewhere [21]. In this paper we have focused on
theory, conceptual issues and general symmetries, while applications to simulations and perturbation 
theory will be presented in upcoming work [17, 21]. 